Predicting co-evolving pairs in Pfam using information theory where entropy is determined by phylogenetic mutations
Abstract
The accurate prediction of co-evolving pairs in protein sequences plays an important role in tertiary protein structure prediction and protein engineering. Information-theoretic detection of co-evolving pairs is affected by the phylogenetic effect on entropy measurements. Here, mutual information (MI) is used to detect co-evolving pairs in a protein family by re-sampling based on mutation events in the phylogenetic tree (RPE). Using the RPE method, the predictive quality for co-evolving pairs with high mutual information (Z >= 4), a sequence distance > 10, and a spatial separation within 12 angstroms averages 81% per Pfam family. Without RPE, the accuracy of detecting co-evolving pairs is 56%. This study represents the first known analysis of mutual information to detect co-evolving pairs in the full Pfam data set. Results for each protein family in Pfam, with a corresponding ribbon model graphing the MI relationships, can be found at http://www.proteinx3d.com.
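The selection criteria in the abstract (high MI Z-score, sequence distance > 10) can be illustrated with a minimal Python sketch. It computes plain mutual information between alignment columns and does not implement the RPE re-sampling, which depends on a phylogenetic tree not described here. The function names, the Z-score computed over all column pairs, and the treatment of every character (including gaps) as an ordinary symbol are assumptions for illustration, not the authors' procedure. The 12 angstrom check is omitted because it requires structure coordinates.

```python
import math
from collections import Counter
from itertools import combinations


def column_entropy(symbols):
    """Shannon entropy (bits) of a list of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def mutual_information(col_a, col_b):
    """MI(A;B) = H(A) + H(B) - H(A,B) for two alignment columns."""
    joint = list(zip(col_a, col_b))
    return column_entropy(col_a) + column_entropy(col_b) - column_entropy(joint)


def candidate_pairs(alignment, z_cutoff=4.0, min_separation=10):
    """Return column pairs with MI Z-score >= z_cutoff and j - i > min_separation.

    `alignment` is a list of equal-length aligned sequences (hypothetical input
    format). The Z-score is taken over all column pairs, an assumption made here.
    """
    n_cols = len(alignment[0])
    columns = [[seq[i] for seq in alignment] for i in range(n_cols)]
    mi = {
        (i, j): mutual_information(columns[i], columns[j])
        for i, j in combinations(range(n_cols), 2)
    }
    values = list(mi.values())
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return []  # every pair has the same MI, so no pair stands out
    return sorted(
        pair
        for pair, v in mi.items()
        if (v - mean) / std >= z_cutoff and pair[1] - pair[0] > min_separation
    )


# Toy alignment: columns 0 and 12 co-vary, all other columns are constant.
toy = [
    "A" + "A" * 11 + "G" + "A",
    "A" + "A" * 11 + "G" + "A",
    "C" + "A" * 11 + "T" + "A",
    "C" + "A" * 11 + "T" + "A",
]
pairs = candidate_pairs(toy)
```

In the toy alignment only the pair of columns 0 and 12 carries any mutual information, so it is the only pair that passes both filters. On real families, plain MI of this kind is exactly what the abstract describes as being affected by phylogenetic effects.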
Similar resources
Using information theory to search for co-evolving residues in proteins
MOTIVATION: Some functionally important protein residues are easily detected since they correspond to conserved columns in a multiple sequence alignment (MSA). However, important residues may also mutate, with compensatory mutations occurring elsewhere in the protein that preserve or restore functionality. It is difficult to distinguish these co-evolving sites from other non-conserved ...
Disentangling Direct from Indirect Co-Evolution of Residues in Protein Alignments
Predicting protein structure from primary sequence is one of the ultimate challenges in computational biology. Given the large amount of available sequence data, the analysis of co-evolution, i.e., statistical dependency, between columns in multiple alignments of protein domain sequences remains one of the most promising avenues for predicting residues that are contacting in the structure. A ke...
Inference of Protein-Protein Interactions by Unlikely Profile Pair
We note that a set of statistically “unusual” protein-profile pairs in an experimentally determined database of protein-protein interactions can typify protein-protein interactions, and propose a novel method called PICUPP that sifts such protein-profile pairs using a statistical simulation. It is demonstrated that unusual Pfam and InterPro profile pairs can be extracted from the DIP database using...
Evaluation of monitoring network density using discrete entropy theory
The regional evaluation of monitoring stations for water resources can be of great importance due to its role in finding appropriate locations for stations, the maximum gathering of useful information and preventing the accumulation of unnecessary information and ultimately reducing the cost of data collection. Based on the theory of discrete entropy, this study analyzes the density of rain gag...
The Impact of the Spectral Filter Bandwidth on the Spectral Entanglement and Indistinguishability of Photon Pairs of SPDC Process
In this paper, we have investigated the dependence of the spectral entanglement and indistinguishability of photon pairs produced by the spontaneous parametric down-conversion (SPDC) procedure on the bandwidth of spectral filters used in the detection setup. The SPDC is a three-wave mixing process which occurs in a nonlinear crystal and generates entangled photon pairs and utilizes as one of th...